I mean, it’s one experiment Michael. How long could it take? 10 minutes?
1 Intro
The question this note tries to answer: if \(N\) participants each take a random amount of time to finish an experiment, how long should we expect the slowest of them to take? The answer gets built up from simple discrete cases to a continuous-time expression via order statistics, asking throughout how it changes with \(N\).
2 Setup
We need some preliminary setup for the theory and modelling. Let us assume participants' completion times (across the participant pool, if you wish) follow a distribution \(F\) over a time variable \(t\in\mathbb{R}^+\).
3 Theory
3.1 A degenerate case
Start with a single participant. How long do they take? With one person, the maximum is just that person's time, so the expected maximum is the expected time it takes them to complete the experiment. If this person is chosen at random from the distribution \(F\), then the expected maximum is just \(E(F)\).
3.2 A simple case of high and low
Let's simplify the domain of the time distribution to only 2 possible equally likely states: a short time \(\tau_s\) and a long time \(\tau_h\). Assume, obviously, that \(\tau_s<\tau_h\). In this case, the maximum value in any random sample of size \(N\) will be \(\tau_h\) unless every observation in the sample is equal to \(\tau_s\), as that is the only possible outcome which does not contain \(\tau_h\). So, for \(N=2\), the possible outcomes are \(\{(\tau_s, \tau_s), (\tau_s, \tau_h), (\tau_h, \tau_s), (\tau_h, \tau_h)\}\). In three of the four outcomes the maximum value is \(\tau_h\). This implies that the expected maximum value is three-quarters of the way between the two times: \(\frac{1}{4}\tau_s+\frac{3}{4}\tau_h\). The general equation is easy to get to, as the coefficient for \(\tau_s\) is always one divided by the number of possible outcomes, i.e. \(\frac{1}{2^N}\). Therefore, to summarize, with only 2 finite states \(\tau_s\) and \(\tau_h\), which occur with equal probability, the expected maximum value with \(N\) participants is:
\[ E(\tau_{max})=\frac{1}{2^N}\tau_s+\frac{2^N-1}{2^N}\tau_h \]
We should next relax the probability assumption, such that the probability of a participant taking a short time to complete the experiment is \(p_s\). We can then easily work out the probability of each possible outcome and use these as weights to construct the expected maximum. The probabilities of the possible outcomes \(\{(\tau_s, \tau_s), (\tau_s, \tau_h), (\tau_h, \tau_s), (\tau_h, \tau_h)\}\) are \(\{p_s^2, p_s(1-p_s), (1-p_s)p_s, (1-p_s)^2\}\), respectively. This means we can construct a slightly more realistic expected maximum:
\[ E(\tau_{max};N,K=2)=p_s^N\tau_s+(1-p_s^N)\tau_h \]
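As a quick sanity check, here is a minimal Python simulation sketch; the values \(p_s=0.7\), \(\tau_s=10\), \(\tau_h=25\) are illustrative assumptions, not anything derived above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (assumptions, not from the text above):
tau_s, tau_h, p_s = 10.0, 25.0, 0.7

for N in [1, 2, 5, 10]:
    # Formula: E(tau_max) = p_s^N * tau_s + (1 - p_s^N) * tau_h
    formula = p_s**N * tau_s + (1 - p_s**N) * tau_h
    # Monte Carlo: draw N times per session, take the max, average over sessions
    draws = rng.choice([tau_s, tau_h], size=(100_000, N), p=[p_s, 1 - p_s])
    simulated = draws.max(axis=1).mean()
    print(f"N={N:2d}  formula={formula:.3f}  simulated={simulated:.3f}")
```

Both columns should agree to a couple of decimal places, and both creep towards \(\tau_h\) as \(N\) grows.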
3.3 Adding Time
Before moving towards a general expression for the expected maximum time for a continuous time variable, let us first add some more discrete options. Let the number of discrete times be \(K\in\mathbb{N}\). First assume \(K=3\), thus the distribution \(F\) is some probability distribution over \(\{\tau_1, \tau_2, \tau_3\}\), where \(\tau_1<\tau_2<\tau_3\). Let the probability of \(\tau_k\) be denoted as:
\[ Pr(\tau_k)=p_k\qquad 1\le k\le K \]
This means that, as before, each possible time has a probability attached, so the probability of each outcome is the joint (product) probability of its individual times (assuming independence). As for the possible outcomes: we have \(N\) participants who can each take 1 of \(K\) possible times, so the total number of possible outcomes is \(K^N\). Thus, with \(N=2, K=3\), there are 9 possible outcomes \(\{(\tau_1, \tau_1), (\tau_1, \tau_2), (\tau_1, \tau_3), (\tau_2, \tau_1), (\tau_2, \tau_2), (\tau_2, \tau_3), (\tau_3, \tau_1), (\tau_3, \tau_2), (\tau_3, \tau_3)\}\). In only 1 of these outcomes is the maximum time \(\tau_1\), in 3 of the outcomes the maximum time is \(\tau_2\), and in the remaining 5 outcomes the maximum is \(\tau_3\). The expected maximum time for \(N=2, K=3\) is therefore:
\[ E(\tau_{max};N=2,K=3)=p_1^2\tau_1+(2p_1p_2+p_2^2)\tau_2+(1-(p_1+p_2)^2)\tau_3 \]
This quickly becomes quite complex, but let's look at \(N=3\), where we have 27 possible outcomes. As always, in only 1 of the outcomes \(\{1\times(\tau_1,\tau_1,\tau_1)\}\) is the maximum value \(\tau_1\); the maximum value is \(\tau_2\) in 7 outcomes \(\{3\times(\tau_1,\tau_1,\tau_2)+3\times(\tau_1,\tau_2,\tau_2)+1\times(\tau_2,\tau_2,\tau_2)\}\); and in the remaining 19 outcomes the maximum value is \(\tau_3\). The expected maximum value in this case is therefore:
\[ E(\tau_{max};N=3,K=3)=p_1^3\tau_1+(3p_1^2p_2+3p_1p_2^2+p_2^3)\tau_2+(1-(p_1+p_2)^3)\tau_3 \]
One can speculate as to what the general case for \(K=3\) is from the above equations. The coefficient for \(\tau_1\) will always be \(p_1^N\), which is self-explanatory, as there is only one possible outcome where \(\tau_1\) is the maximum value. The coefficient for \(\tau_3\) will always be \(1-(p_1+p_2)^N\). To see why, note that this coefficient is the probability that an outcome contains at least one \(\tau_3\), which is 1 minus the probability that the outcome contains no \(\tau_3\). To contain no \(\tau_3\), every participant must take either \(\tau_1\) or \(\tau_2\), and the probability of this is \((p_1+p_2)^N\). The coefficient for \(\tau_2\) can then be found as the complement to the other coefficients: \((p_1+p_2)^N-p_1^N\). Therefore, the general case for \(K=3\) is:
\[ E(\tau_{max};N,K=3)=p_1^N\tau_1+((p_1+p_2)^N-p_1^N)\tau_2+(1-(p_1+p_2)^N)\tau_3 \]
4 The effect of N
One important question here is about how the expected maximum time changes with the number of participants. If your intuition is that it increases, then congrats, this is the case. The following subsections will explain.
4.1 K=2
As found previously, the expected maximum time for only 2 possible states of time is:
\[ E(\tau_{max};N,K=2)=p_1^N\tau_1+(1-p_1^N)\tau_2 \qquad\tau_1<\tau_2 \]
It is easy to see that this is increasing in \(N\): since \(0\le p_1\le1\), raising \(p_1\) to the power \(N\) can only make the weight on \(\tau_1\) fall, meaning the weight on \(\tau_2\) rises with \(N\). Thus, as one increases the number of participants, the expected maximum time across those participants also increases. So, in this highly simplified model, one should expect sessions to take longer the more participants they contain.
4.2 K=3
Similarly, the expected maximum time for 3 possible states of time is:
\[ E(\tau_{max};N,K=3)=p_1^N\tau_1+((p_1+p_2)^N-p_1^N)\tau_2+(1-(p_1+p_2)^N)\tau_3 \]
One does not need to differentiate this to see that it too is increasing in \(N\). Rearranging gives \(E(\tau_{max};N,K=3)=\tau_3-(\tau_3-\tau_2)(p_1+p_2)^N-(\tau_2-\tau_1)p_1^N\); both subtracted terms are positive and shrink as \(N\) grows (the coefficient for \(\tau_2\) on its own need not fall monotonically, but these differences do). This implies that the overall expected maximum when \(K=3\) must be increasing in the number of participants in the sample \(N\).
4.3 K=4
With more time options come more possible outcomes. With 2 participants there are 16 possible outcomes. The expected maximum can be given as:
\[ E(\tau_{max};N=2,K=4)=p_1^2\tau_1+(2p_1p_2+p_2^2)\tau_2+(2p_1p_3+2p_2p_3+p_3^2)\tau_3+(1-(p_1+p_2+p_3)^2)\tau_4 \]
This can be simplified as:
\[ E(\tau_{max};N=2,K=4)=p_1^2\tau_1+(2p_1+p_2)p_2\tau_2+(2(p_1+p_2)+p_3)p_3\tau_3+(1-(p_1+p_2+p_3)^2)\tau_4 \]
We can increase \(N\) to 3 without too much added complication, despite there being 64 possible outcomes. This is due to the simplicity of the probabilities. The probability of the max being the lowest time is always the probability of every participant taking the lowest time \((p_1^N)\). The probability of the max being the second lowest time is the probability that every participant takes one of the 2 lowest times, minus the probability that they all take the lowest time: \([(p_1+p_2)^N-p_1^N]\). Likewise, for the maximum being \(\tau_3\), this is the probability that all participants take a time no higher than \(\tau_3\), minus the probability that all take a time no higher than \(\tau_2\): \((p_1+p_2+p_3)^N-(p_1+p_2)^N\). Therefore the expected maximum for any \(N\) when \(K=4\) is:
\[ E(\tau_{max};N,K=4)=p_1^N\tau_1+[(p_1+p_2)^N-p_1^N]\tau_2+[(p_1+p_2+p_3)^N-(p_1+p_2)^N]\tau_3+[1-(p_1+p_2+p_3)^N]\tau_4 \]
4.4 Any K
A pattern is emerging which allows us to form a general equation for the expected maximum for any value of both \(N\) and \(K\):
\[ E(\tau_{max},N,K)=\sum_{k=0}^{K-1}[(\sum_{s=0}^{k}p_{s+1})^N-(\sum_{s=0}^{k-1}p_{s+1})^N]\tau_{k+1} \]
Note that when \(k=0\) (i.e. the first term, which matches the \(p_1^N\) coefficient in all of the expected max expressions above) the second of the inner sums is empty and equal to zero, as its upper limit \(k-1=-1\) lies below its lower limit \(s=0\).
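The general formula is easy to turn into code. Below is a small sketch that computes it for arbitrary \(N\) and \(K\) and checks it against simulation; the times and probabilities are made-up illustrative values:

```python
import numpy as np

def expected_max(taus, probs, N):
    """E(tau_max) = sum_k [F(tau_k)^N - F(tau_{k-1})^N] * tau_k, where F is
    the cumulative distribution and taus are sorted ascending."""
    F = np.cumsum(probs)                      # F(tau_k) = p_1 + ... + p_k
    F_prev = np.concatenate(([0.0], F[:-1]))  # F(tau_{k-1}); zero for k = 1
    return float(np.sum((F**N - F_prev**N) * np.asarray(taus)))

# Illustrative K = 3 parameters (assumptions, not from the text above):
taus, probs = [5.0, 12.0, 30.0], [0.5, 0.3, 0.2]
rng = np.random.default_rng(1)
for N in [2, 3, 8]:
    simulated = rng.choice(taus, size=(100_000, N), p=probs).max(axis=1).mean()
    print(f"N={N}  formula={expected_max(taus, probs, N):.3f}  simulated={simulated:.3f}")
```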
5 Continuous time
The previous equation, whilst useful, only applies to discrete time periods, which does not reflect exact reality. One could increase the number of periods to better match reality, but in order to truly reflect real time, we need to construct an equivalent equation for continuous time.
Firstly, one common way of approximating continuous counterparts to summations is to let the width of each component in the sum go to zero. If one notes that a summation approximates the area under a curve by a sum of areas of rectangles, then the approximation improves in accuracy the more rectangles we have. This is equivalent to allowing \(K\) to tend to infinity, and thus the width of these rectangles to tend to 0. Suppose the time taken follows some function \(T:k\rightarrow \tau\), and assume some probability density function \(f:\tau\rightarrow p\); then the equation would be something like this:
\[ E(\tau_{max},N,K=\infty)=\sum_{k=0}^\infty[(\sum_{s=0}^{k}p_{s+1})^N-(\sum_{s=0}^{k-1}p_{s+1})^N]T(k) \]
\[ E(\tau_{max},N,K=\infty)=\sum_{k=0}^\infty[(\sum_{s=0}^{k}f(T(s+1)))^N-(\sum_{s=0}^{k-1}f(T(s+1)))^N]T(k) \]
The problem here is that the difference between the two summations within the infinite sum tends to zero as the difference between \(p_{k+1}\) and \(p_k\) tends to zero. There is a clear continuous analogue to these summations though, which is a definite integral:
\[ \sum_{s=0}^k f(T(s+1))\approx\int_0^k f(T(s+1))ds=F(T(k+1)) \]
Therefore, the continuous version of the expected maximum time intuitively follows as:
\[ E(\tau_{max},N)=\int_0^\infty \bigg[(\int_0^k f(T(s+1))ds)^N-(\int_0^{k-1} f(T(s+1))ds)^N\bigg]T(k)dk \]
\[ E(\tau_{max},N)=\int_0^\infty [F(T(k+1))^N-F(T(k))^N]T(k)dk \]
At this point, we need to deal with \(k\), as this is just an index on our time axis, which in reality is now continuous. Luckily, the only times we are concerned with are \(T(k)\) and a time infinitesimally larger than it, so denote \(T(k)\) as \(t\) and \(T(k+1)\) as \(t+\epsilon\), where \(\epsilon\) is very small. We then fudge the maths slightly (as \(k\) is an integer and \(t\) is a real number) by allowing \(T(0)=0\) and \(T(\infty)=\bar{t}\), so that \(T(k)=\bar{t}g(k)\) where \(g(0)=0\) and \(g(\infty)=1\), and \(dt=\bar{t}g'(k)dk\):
\[ E(\tau_{max},N)=\int_0^\bar{t} [F(t+\epsilon)^N-F(t)^N]\frac{t}{\bar{t}g'(k)} dt \]
Can this be simplified? Maybe; perhaps there is a derivative lurking in there when \(\epsilon\rightarrow0\).
5.1 New approach
Instead of trying to transform an equation from discrete to continuous time mechanically, it may be more profitable to find the continuous time version intuitively. Let's start with some parameters: let time be denoted as \(t\in\mathbb{R}\) and let some maximum time be denoted as \(\bar{t}\), which can be finite or infinite; let \(f\) be the probability density function over \(t\), which only takes positive values in the interval \([0,\bar{t}]\), and zero elsewhere; and let \(F\) be the corresponding cumulative distribution function over \(t\), satisfying \(F(0)=F(-\infty)=0\) and \(F(\bar{t})=F(\infty)=1\).
In the discrete case, the coefficient for a time \(\tau\) is the difference between the probability that every participant's time is at most \(\tau\), \(\Pr(t\le \tau)^N=F(\tau)^N\), and the probability that every participant's time is strictly less than \(\tau\), \(\Pr(t\lt \tau)^N=F(\tau-\epsilon)^N\). This is easy with a finite number of outcomes, as the difference is clear, but with continuous variables the difference is vanishingly small. It can be written as \(\alpha_\tau\) below, where \(\epsilon\) is some very small number:
\[ \alpha_{\tau}=F(\tau)^N-F(\tau-\epsilon)^N \]
It would then follow that the expected maximum time is the sum of the product between these coefficients and their respective times. Using continuous variables, this is no longer a sum but instead an integral over the relevant time interval \([0,\bar{t}]\):
\[ E(\tau_{max};N)=\int_0^\bar{t}\alpha_{\tau}\tau d\tau=\int_0^\bar{t}[F(\tau)^N-F(\tau-\epsilon)^N]\tau d\tau \]
The question that remains is whether this is possible to compute. If we let \(\epsilon\rightarrow0\), the coefficient collapses to zero. Even rewriting it as a difference quotient cannot help:
\[ E(\tau_{max};N)=\int_0^\bar{t}\frac{F(\tau)^N-F(\tau-\epsilon)^N}{\epsilon}\epsilon\,\tau d\tau= \int_0^\bar{t}NF(\tau)^{N-1}f(\tau)\epsilon\,\tau d\tau \]
The stray \(\epsilon\) still sends the whole integral to zero. The underlying problem is that \(\alpha_\tau\) is a probability mass, not a density, so integrating it against \(d\tau\) counts the width of each interval twice. Instead we need to take a different approach, from order statistics.
5.2 Yet another new approach
Let \(X\) be a sample of \(N\) times indexed by \(i\), such that \(\max(X)=X_{max}\). Let the pdf and cdf for each \(X_i\) still be \(f\) and \(F\), respectively. The probability that \(X_{max}\le x\) is the probability that every \(X_i\) in the sample is at most \(x\), i.e.:
\[ \Pr(X_{max}\le x)=\Pr(X_1\le x)\Pr(X_2\le x)...\Pr(X_N\le x)=[\Pr(X_i\le x)]^N=[F(x)]^N \]
In order to find the expected value of \(X_{max}\) we first need to find its pdf. Differentiating the above probability gives:
\[ f_{X_{max}}(x)=\frac{d}{dx}[F(x)]^N=NF(x)^{(N-1)}f(x) \]
Therefore, one can construct an expected maximum for the sample \(X\) as:
\[ E(\max(X))=N\int_{-\infty}^\infty F(x)^{N-1}f(x)xdx \]
We can simplify this for our case, as time is bounded between a minimum and a maximum, \(x\in[0,\bar{x}]\):
\[ E(\max(X))=N\int_0^\bar{x} xF(x)^{N-1}f(x)dx \]
Unfortunately, this has no general analytical solution, and so can only really be evaluated further with examples. Before we do that though, we can try to answer one question: does this increase with \(N\)? Let's differentiate using the Leibniz rule:
\[ \frac{d}{dN}E(\max(X))=\int_0^\bar{x}xf(x)\frac{\partial}{\partial N}(NF(x)^{N-1})dx =\int_0^\bar{x}xf(x)F(x)^{N-1}(N\ln(F(x))+1)dx \]
This is probably even less integrable than before, and the sign is not obvious from inspection: the \(+1\) pushes the integrand up, but \(\ln(F(x))\) is negative everywhere in the interior, so the integrand changes sign. It should be positive overall; we return to this in section 7.
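In the meantime, the order-statistics formula itself is easy to check numerically before throwing specific distributions at it. A rough sketch (trapezoidal integration against a Monte Carlo estimate, using an exponential \(F\) with an illustrative \(\lambda=0.1\) as a stand-in; it reappears in the examples below):

```python
import numpy as np

def expected_max_continuous(pdf, cdf, N, x_bar, n_grid=20_000):
    """Trapezoidal evaluation of N * integral of x * F(x)^(N-1) * f(x) dx
    over [0, x_bar]."""
    x = np.linspace(0.0, x_bar, n_grid)
    return np.trapz(N * x * cdf(x) ** (N - 1) * pdf(x), x)

# Exponential stand-in; lambda = 0.1 is an illustrative assumption, and
# x_bar = 500 approximates "large enough that F(x_bar) is essentially 1".
lam = 0.1
pdf = lambda x: lam * np.exp(-lam * x)
cdf = lambda x: 1.0 - np.exp(-lam * x)

rng = np.random.default_rng(2)
for N in [1, 5, 20]:
    numeric = expected_max_continuous(pdf, cdf, N, x_bar=500.0)
    simulated = rng.exponential(1 / lam, size=(100_000, N)).max(axis=1).mean()
    print(f"N={N:2d}  integral={numeric:.3f}  simulated={simulated:.3f}")
```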
6 Examples
We need a PDF and CDF that satisfy the appropriate conditions, namely \(F(0)=0\) and \(F(\bar{x})=1\). The latter condition may be hard to satisfy, but is less important, as the time axis can extend out to infinity provided the probability density out there is extremely low. The rationale for a finite time limit is real-life appropriateness, but it isn't vital. Some popular candidate distributions are the exponential, chi-squared and log-logistic:
\[ f(x;\lambda)=\begin{cases}\lambda e^{-\lambda x} &\ x\ge0 \\ 0 &\ x\lt0 \end{cases} \qquad F(x;\lambda)=1-e^{-\lambda x}\qquad\text{(exponential)} \]
\[ f(x;k)=\begin{cases} \frac{x^{\frac{k}{2}-1}e^{-\frac{x}{2}}}{2^{k/2}\Gamma(k/2)} &\ x\gt0 \\ 0 &\ \text{otherwise} \end{cases} \qquad F(x;k)=\frac{1}{\Gamma(k/2)}\gamma\Big(\frac{k}{2},\frac{x}{2}\Big) \qquad \text{(chi-squared)} \]
\[ f(x;\alpha,\beta)=\frac{\beta/\alpha(x/\alpha)^{\beta-1}}{(1+(x/\alpha)^\beta)^2} \qquad F(x;\alpha,\beta)=\frac{1}{1+(x/\alpha)^{-\beta}}=\frac{x^\beta}{\alpha^\beta+x^\beta} \qquad \text{(log-logistic)} \]
Of these, the chi-squared is cumbersome, so let’s focus on the other two. The simplest is the exponential distribution, so start there.
6.1 Exponential Distribution
Given the above definitions, the expected maximum using the exponential distribution will be given as:
\[ E(\max(X))=\lambda N\int_0^\bar{x} x(1-e^{-\lambda x})^{N-1}e^{-\lambda x}dx \]
Can this be integrated? Over \([0,\infty)\) it actually can: expanding \((1-e^{-\lambda x})^{N-1}\) with the binomial theorem and integrating term by term gives the classic result \(E(\max(X))=\frac{1}{\lambda}\sum_{k=1}^N\frac{1}{k}\), the \(N\)-th harmonic number over \(\lambda\), which is plainly increasing in \(N\). Over a finite \([0,\bar{x}]\) it is messier, so let's see how it changes with \(N\) directly. Through the Leibniz rule, the derivative is given by:
\[ \frac{d}{dN}E(\max(X))=\lambda\int_0^\bar{x}xe^{-\lambda x}\frac{\partial}{\partial N}\big(N(1-e^{-\lambda x})^{N-1}\big)dx =\lambda\int_0^\bar{x}xe^{-\lambda x}(1+N\ln(1-e^{-\lambda x}))(1-e^{-\lambda x})^{N-1}dx \]
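The harmonic-number result above is cheap to verify by simulation; \(\lambda=0.1\) is an illustrative assumption:

```python
import numpy as np

# Closed form over [0, infinity): E(max of N iid Exp(lambda)) = H_N / lambda,
# where H_N is the N-th harmonic number.
lam = 0.1
rng = np.random.default_rng(3)
for N in [1, 2, 10, 50]:
    harmonic = sum(1.0 / k for k in range(1, N + 1)) / lam
    simulated = rng.exponential(1 / lam, size=(100_000, N)).max(axis=1).mean()
    print(f"N={N:3d}  H_N/lambda={harmonic:.3f}  simulated={simulated:.3f}")
```

Note that it grows like \(\ln(N)/\lambda\): each extra participant adds less and less to the expected session length.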
6.2 Log-Logistic
Likewise, the expected max under the log-logistic is:
\[ E(\max(X))=N\int_0^\bar{x} x\Big(\frac{x^\beta}{\alpha^\beta+x^\beta}\Big)^{N-1} \frac{\beta/\alpha(x/\alpha)^{\beta-1}}{(1+(x/\alpha)^\beta)^2}dx =N\beta\alpha^\beta\int_0^\bar{x} \frac{x^{N\beta}}{(\alpha^\beta+x^\beta)^{N+1}}dx \]
This looks more integrable; is it? Well, if it isn't, let's at least find the derivative with respect to \(N\):
\[ \frac{d}{dN}E(\max(X))=\beta\alpha^\beta\int_0^\bar{x}\frac{\partial}{\partial N} \Big(\frac{Nx^{\beta N}}{(\alpha^\beta+x^\beta)^{N+1}}\Big)dx \]
\[ =\beta\alpha^\beta\int_0^\bar{x} \frac{x^{\beta N}}{(\alpha^\beta+x^\beta)^{N+1}} + \frac{\beta Nx^{\beta N}\ln(x)}{(\alpha^\beta+x^\beta)^{N+1}} - \frac{Nx^{\beta N}\ln(\alpha^\beta+x^\beta)}{(\alpha^\beta+x^\beta)^{N+1}} dx \]
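Neither the integral nor its derivative looks tractable by hand, but both are easy to handle numerically. A sketch with illustrative parameters \(\alpha=10\), \(\beta=4\), checking the simplified integral against simulated maxima drawn via the inverse CDF \(x=\alpha(u/(1-u))^{1/\beta}\) (the finite grid stands in for \(\bar{x}\)):

```python
import numpy as np

# Illustrative log-logistic parameters (assumptions, not from the text):
alpha, beta = 10.0, 4.0
rng = np.random.default_rng(4)

x = np.linspace(0.0, 200.0, 50_000)  # finite grid standing in for x_bar
for N in [2, 5, 15]:
    # Simplified integrand from above:
    # N * beta * alpha^beta * x^(N*beta) / (alpha^beta + x^beta)^(N+1)
    integrand = N * beta * alpha**beta * x**(N * beta) / (alpha**beta + x**beta)**(N + 1)
    numeric = np.trapz(integrand, x)
    # Inverse-CDF sampling: x = alpha * (u / (1 - u))^(1/beta)
    u = rng.uniform(size=(100_000, N))
    simulated = (alpha * (u / (1 - u)) ** (1 / beta)).max(axis=1).mean()
    print(f"N={N:2d}  integral={numeric:.3f}  simulated={simulated:.3f}")
```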
6.3 Fréchet Distribution
This distribution is relatively simple with the PDF and CDF as follows:
\[ f(x;\alpha)=\alpha x^{-(1+\alpha)}e^{-x^{-\alpha}} \]
\[ F(x;\alpha)=e^{-x^{-\alpha}} \]
Thus, with this choice of distribution, the expected maximum would be given as:
\[ \alpha N\int_0^\bar{x} x^{-\alpha}e^{-Nx^{-\alpha}} dx = \bigg[N^{\frac{1}{\alpha}}\Gamma\bigg(\frac{\alpha-1}{\alpha},\frac{N}{x^\alpha}\bigg)\bigg]_0^\bar{x} \]
This looks to be the first distribution that is actually integrable, though even this is somewhat incomplete, as the answer involves the upper incomplete gamma function, which is itself defined as an integral, so we can't take the process any further in closed form. (Note that letting \(\bar{x}\rightarrow\infty\) gives \(E(\max(X))=N^{\frac{1}{\alpha}}\Gamma(\frac{\alpha-1}{\alpha})\) for \(\alpha>1\), which is plainly increasing in \(N\).) It does, however, mean that we can find the derivative, although it too contains the incomplete gamma function, courtesy of the product rule, i.e.: \[ \dfrac{N^{\frac{1}{\alpha}-1}\mathrm{e}^{-\frac{N}{x^\alpha}}\cdot\left(\operatorname{\Gamma}\left(\frac{\alpha-1}{\alpha},\frac{N}{x^\alpha}\right)\,\mathrm{e}^\frac{N}{x^\alpha}-\alpha\cdot\left(\frac{N}{x^\alpha}\right)^\frac{\alpha-1}{\alpha}\right)}{\alpha} \]
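The \(\bar{x}\rightarrow\infty\) closed form above makes the Fréchet case easy to verify by simulation, using inverse-CDF sampling \(x=(-\ln u)^{-1/\alpha}\); \(\alpha=3\) is an illustrative choice:

```python
import math
import numpy as np

# The max of N iid Frechet draws is itself Frechet (CDF exp(-N x^{-alpha})),
# so with x_bar -> infinity: E(max) = N^(1/alpha) * Gamma(1 - 1/alpha), alpha > 1.
alpha = 3.0  # illustrative assumption
rng = np.random.default_rng(5)
for N in [1, 4, 16]:
    closed_form = N ** (1 / alpha) * math.gamma(1 - 1 / alpha)
    # Inverse-CDF sampling: F(x) = exp(-x^{-alpha})  =>  x = (-ln u)^(-1/alpha)
    u = rng.uniform(size=(200_000, N))
    simulated = ((-np.log(u)) ** (-1 / alpha)).max(axis=1).mean()
    print(f"N={N:2d}  closed-form={closed_form:.3f}  simulated={simulated:.3f}")
```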
7 Intuitive Reasoning
The general maths hasn't been able to show the expected maximum increasing with \(N\) (only specific examples have), but thankfully a little intuitive logic can. Recall the equation for the expected maximum:
\[ E(\max(X))=N\int_0^\bar{x} xF(x)^{N-1}f(x)dx \]
Can we say whether this is increasing in \(N\)? There are counteracting forces: the \(N\) at the front acts as a scalar and certainly pushes the expected maximum up, but the power \(N-1\) on the CDF inside the integral pulls the integrand down (as \(F(x)<1\) in the interior). Two intuitive arguments settle it. First, adding a participant can never lower the maximum: in every realisation, the max over \(N+1\) draws is at least the max over the first \(N\) of them, so its expectation is at least as large. Second, for a non-negative variable \(E(Y)=\int_0^\infty \Pr(Y>y)dy\), which here gives
\[ E(\max(X))=\int_0^\bar{x}(1-F(x)^N)dx \]
and since \(F(x)^N\) falls as \(N\) rises for every \(x\), the expected maximum must rise with \(N\). So yes: more participants, longer sessions.
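To close, a minimal sketch confirming that identity numerically, again with an exponential \(F\) and an illustrative \(\lambda\):

```python
import numpy as np

# E(max) = integral of 1 - F(x)^N: since F(x)^N falls as N rises
# (0 <= F <= 1), the integrand, and hence the expected maximum, can only
# grow with N. Checked here for an exponential F.
lam = 0.1
x = np.linspace(0.0, 500.0, 50_000)
F = 1.0 - np.exp(-lam * x)
previous = -np.inf
for N in range(1, 16):
    e_max = np.trapz(1.0 - F**N, x)
    assert e_max >= previous  # monotone in N
    previous = e_max
    if N in (1, 5, 10, 15):
        print(f"N={N:2d}  E(max)={e_max:.3f}")
```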